MHBase: A Distributed Real-Time Query Scheme for Meteorological Data Based on HBase
Authors
Abstract
Meteorological technology has evolved rapidly in recent years, bringing large, accurate and personalized benefits to public services. Technologies such as geographical remote sensing, meteorological radar and satellites generate ever larger volumes of observational data, which makes data analysis in weather forecasting more precise but also challenges traditional methods of data storage. In this paper, we present MHBase (Meteorological data based on HBase), a distributed real-time query scheme for meteorological data built on HBase (Hadoop Database). The calibrated data obtained from terminal devices is partitioned into HBase and persisted to HDFS (the Hadoop Distributed File System). We propose two algorithms, the Indexed Store and the Indexed Retrieve algorithms, to implement a secondary index using HBase Coprocessors, which allows MHBase to provide high-performance querying on columns other than the rowkey. Experimental results show that the performance of MHBase can satisfy the basic demands of meteorological business services.
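The abstract does not spell out the Indexed Store and Indexed Retrieve algorithms, so the following is only a minimal in-memory sketch of the general secondary-index idea they rely on: every write also records a reverse mapping from an indexed column value to the rowkey, and a query on that column reads the mapping first and then fetches rows by rowkey. The class `TinyTable`, its method names, the `station` column and the rowkey layout are all hypothetical, and plain Python dictionaries stand in for HBase tables; none of this is the authors' coprocessor code.

```python
from collections import defaultdict


class TinyTable:
    """In-memory stand-in for a rowkey-addressed table (not real HBase)."""

    def __init__(self, indexed=("station",)):
        self.indexed = tuple(indexed)      # hypothetical indexed columns
        self.rows = {}                     # rowkey -> {column: value}
        self.index = defaultdict(set)      # (column, value) -> {rowkeys}

    def indexed_store(self, rowkey, columns):
        """Write a row and keep the index entries in step with it."""
        old = self.rows.get(rowkey, {})
        for col in self.indexed:
            if col in old:
                # Remove the stale index entry left by an overwritten row.
                self.index[(col, old[col])].discard(rowkey)
        self.rows[rowkey] = dict(columns)
        for col in self.indexed:
            if col in columns:
                self.index[(col, columns[col])].add(rowkey)

    def indexed_retrieve(self, column, value):
        """Look up rowkeys in the index, then fetch each row by rowkey."""
        rowkeys = sorted(self.index.get((column, value), ()))
        return [(rk, self.rows[rk]) for rk in rowkeys]


table = TinyTable(indexed=("station",))
table.indexed_store("20160101-0001", {"station": "S1", "temp": 3.5})
table.indexed_store("20160101-0002", {"station": "S2", "temp": 4.0})
table.indexed_store("20160102-0001", {"station": "S1", "temp": 2.0})
matches = table.indexed_retrieve("station", "S1")
```

In this sketch a query on `station` touches only the matching index entry and the rows it names, instead of scanning every row. In an HBase deployment the index would presumably live in its own table and be maintained server-side by a coprocessor hook on writes, but that wiring is an assumption here.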
Similar resources
Separating indexes from data: a distributed scheme for secure database outsourcing
Database outsourcing is an idea to eliminate the burden of database management from organizations. Since data is a critical asset of organizations, preserving its privacy from outside adversary and untrusted server should be warranted. In this paper, we present a distributed scheme based on storing shares of data on different servers and separating indexes from data on a distinct server. Shamir...
Distributed Storage and Processing Method for Big Data Sensing Information of Machine Operation Condition
The traditional relational database cannot satisfy the requirements of the high speed and real-time storage and processing for the distributed Big Data sensing information in the Wide Area Network environment. In this context, the No-SQL database HBase is used to store the big data sensing information of machine operation condition collected by Fiber Bragg Grating sensor network. The distribute...
H2RDF+: High-performance distributed joins over large-scale RDF graphs
The proliferation of data in RDF format calls for efficient and scalable solutions for their management. While scalability in the era of big data is a hard requirement, modern systems fail to adapt based on the complexity of the query. Current approaches do not scale well when faced with substantially complex, non-selective joins, resulting in exponential growth of execution times. In this work...
Distributed RDF Triple Store Using HBase and Hive
The growth of web data has presented new challenges regarding the ability to effectively query RDF data. Traditional relational database systems efficiently scale and query distributed data. With the development of Hadoop its implementation of the MapReduce Framework along with HBase, a NoSQL data store, the semantics of processing and querying data has changed. Given the existing structure of ...
A Fast and High Throughput SQL Query System for Big Data
Relational data query always plays an important role in data analysis. But how to scale out the traditional SQL query system is a challenging problem. In this paper, we introduce a fast, high throughput and scalable system to perform read-only SQL well with the advantage of NoSQL’s distributed architecture. We adopt HBase as the storage layer and design a distributed query engine (DQE) collabor...
Journal title:
- Future Internet
Volume 8, Issue
Pages -
Publication date 2016